A Joint Phrasal and Dependency Model for Paraphrase Alignment
Abstract
Monolingual alignment is frequently required for natural language tasks that involve similar or comparable sentences. We present a new model for monolingual alignment in which the score of an alignment decomposes over both the set of aligned phrases and a set of aligned dependency arcs. Optimal alignments under this scoring function are decoded using integer linear programming, while model parameters are learned using standard structured prediction approaches. We evaluate our joint aligner on the Edinburgh paraphrase corpus and show significant gains over a Meteor baseline and a state-of-the-art phrase-based aligner.
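As a rough illustration of the scoring decomposition described in the abstract, the sketch below scores an alignment as a sum over aligned token pairs plus a bonus for every dependency arc whose head and dependent are both aligned to the endpoints of an arc in the other sentence. Everything in it is an assumption made for illustration and is not taken from the paper: the toy sentences, the hand-set weights (`token_score`, `ARC_BONUS`), the restriction to single-token "phrases", and the exhaustive search, which stands in for integer linear programming only because the example is tiny.

```python
from itertools import combinations, permutations

ARC_BONUS = 0.75  # hypothetical weight for each pair of aligned dependency arcs


def token_score(a, b):
    """Hypothetical lexical score: reward identical tokens, penalize mismatches."""
    return 1.0 if a.lower() == b.lower() else -0.5


def score_alignment(pairs, src, tgt, src_arcs, tgt_arcs):
    """Score = sum over aligned token pairs + sum over aligned dependency arcs."""
    mapping = dict(pairs)  # source index -> target index
    total = sum(token_score(src[i], tgt[j]) for i, j in pairs)
    for head, dep in src_arcs:
        # An arc counts as aligned when both endpoints map onto a target arc.
        if head in mapping and dep in mapping:
            if (mapping[head], mapping[dep]) in tgt_arcs:
                total += ARC_BONUS
    return total


def decode(src, tgt, src_arcs, tgt_arcs):
    """Exhaustively search one-to-one alignments (a stand-in for ILP decoding)."""
    tgt_arcs = set(tgt_arcs)
    best_pairs, best_score = (), 0.0  # the empty alignment scores zero
    for k in range(1, min(len(src), len(tgt)) + 1):
        for src_subset in combinations(range(len(src)), k):
            for tgt_choice in permutations(range(len(tgt)), k):
                pairs = tuple(zip(src_subset, tgt_choice))
                s = score_alignment(pairs, src, tgt, src_arcs, tgt_arcs)
                if s > best_score:
                    best_pairs, best_score = pairs, s
    return best_pairs, best_score


# Toy paraphrase pair with (head, dependent) arcs given as token indices.
src = ["She", "bought", "a", "book"]
tgt = ["She", "purchased", "a", "book"]
arcs = [(1, 0), (1, 3), (3, 2)]

joint_pairs, joint_score = decode(src, tgt, arcs, arcs)
lexical_pairs, lexical_score = decode(src, tgt, [], [])
print(joint_pairs, joint_score)
print(lexical_pairs, lexical_score)
```

In this toy setup the lexical terms alone leave "bought" and "purchased" unaligned, because their pair score is negative, while the arc terms make aligning them worthwhile. That is the kind of interaction a joint phrasal and dependency score is meant to capture; a realistic decoder would replace the enumeration with an ILP solver and learned feature weights.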
Similar resources
Inversion Transduction Grammar for Joint Phrasal Translation Modeling
We present a phrasal inversion transduction grammar as an alternative to joint phrasal translation models. This syntactic model is similar to its flat-string phrasal predecessors, but admits polynomial-time algorithms for Viterbi alignment and EM training. We demonstrate that the consistency constraints that allow flat phrasal models to scale also help ITG algorithms, producing an 80-times faste...
Dependency Treelet Translation: Syntactically Informed Phrasal SMT
We describe a novel approach to statistical machine translation that combines syntactic information in the source language with recent advances in phrasal translation. This method requires a source-language dependency parser, target language word segmentation and an unsupervised word alignment component. We align a parallel corpus, project the source dependency parse onto the target sentence, e...
Paraphrase Alignment for Synonym Evidence Discovery
We describe a new unsupervised approach for synonymy discovery by aligning paraphrases in monolingual domain corpora. For that purpose, we identify phrasal terms that convey most of the concepts within domains and adapt a methodology for the automatic extraction and alignment of paraphrases to identify paraphrase casts from which valid synonyms are discovered. Results performed on two different...
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...
A Probabilistic Model for Measuring Grammaticality and Similarity of Automatically Generated Paraphrases of Predicate Phrases
The most critical issue in generating and recognizing paraphrases is the development of wide-coverage paraphrase knowledge. Previous work on paraphrase acquisition has collected lexicalized pairs of expressions; however, the results do not ensure full coverage of the various paraphrase phenomena. This paper focuses on productive paraphrases realized by general transformation patterns, and addresses...